When can the two-armed bandit algorithm be trusted?
We investigate the asymptotic behavior of one version of the so-called
two-armed bandit algorithm. It is an example of stochastic approximation
procedure whose associated ODE has both a repulsive and an attractive
equilibrium, at which the procedure is noiseless. We show that if the gain
parameter is constant or goes to 0 not too fast, the algorithm does fall in the
noiseless repulsive equilibrium with positive probability, whereas it always
converges to its natural attractive target when the gain parameter goes to zero
at some appropriate rates depending on the parameters of the model. We also
elucidate the behavior of the constant step algorithm when the step goes to 0.
Finally, we highlight the connection between the algorithm and the
Pólya urn. An application to asset allocation is briefly described.
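The abstract does not state the recursion itself. As a minimal sketch, assuming the version studied is the linear reward-inaction update (the probability of playing an arm is reinforced only when that arm pays off), the procedure can be simulated as below. The names `bandit_step`, `run_bandit`, `p_a`, `p_b` and `gain_fn` are illustrative and do not come from the paper. Under this assumption the two points x = 0 and x = 1 are fixed by the update, which is one way to read "noiseless" equilibria.

```python
import random


def bandit_step(x, gain, p_a, p_b, rng):
    """One assumed linear reward-inaction update of x = P(play arm A).

    p_a and p_b are the (hypothetical) success probabilities of the arms;
    gain should lie in (0, 1) so that x stays in [0, 1].
    """
    if rng.random() < x:            # arm A is played
        if rng.random() < p_a:      # A pays off: move x toward 1
            return x + gain * (1.0 - x)
    else:                           # arm B is played
        if rng.random() < p_b:      # B pays off: move x toward 0
            return x - gain * x
    return x                        # no payoff: leave x unchanged


def run_bandit(x0, p_a, p_b, n_steps, gain_fn, seed=0):
    """Iterate the update n_steps times; gain_fn(n) is the gain at step n."""
    rng = random.Random(seed)
    x = x0
    for n in range(1, n_steps + 1):
        x = bandit_step(x, gain_fn(n), p_a, p_b, rng)
    return x


if __name__ == "__main__":
    # Constant gain versus a decreasing gain; both choices are illustrative.
    print(run_bandit(0.5, 0.7, 0.3, 5000, lambda n: 0.05))
    print(run_bandit(0.5, 0.7, 0.3, 5000, lambda n: 1.0 / (n + 1)))
```

Running the simulation over many seeds with different `gain_fn` choices gives a rough empirical view of how often the iterate ends near the wrong arm, which is the question the abstract addresses rigorously.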
Generalized Urn Models of Evolutionary Processes
Generalized Pólya urn models can describe the dynamics of finite populations
of interacting genotypes. Three basic questions these models can address are:
Under what conditions does a population exhibit growth? On the event of growth,
at what rate does the population increase? What is the long-term behavior of
the distribution of genotypes? To address these questions, we associate a mean
limit ordinary differential equation (ODE) with the urn model. Previously, it
has been shown that on the event of population growth, the limiting
distribution of genotypes is a connected internally chain recurrent set for the
mean limit ODE. To determine when growth and convergence occur with positive
probability, we prove two results. First, if the mean limit ODE has an
"attainable" attractor at which growth is expected, then growth and
convergence toward this attractor occur with positive probability. Second, the
population distribution almost surely does not converge to sets where growth is
not expected.
Edge-reinforced random walk, Vertex-Reinforced Jump Process and the supersymmetric hyperbolic sigma model
Edge-reinforced random walk (ERRW), introduced by Coppersmith and Diaconis in
1986, is a random process, which takes values in the vertex set of a graph,
and is more likely to cross edges it has visited before. We show that it can be
represented in terms of a Vertex-reinforced jump process (VRJP) with
independent gamma conductances: the VRJP was conceived by Werner and first
studied by Davis and Volkov (2002, 2004), and is a continuous-time process
favouring sites with more local time. We calculate, for any finite graph,
the limiting measure of the centred occupation time measure of VRJP, and
interpret it as a supersymmetric hyperbolic sigma model in quantum field
theory, introduced by Zirnbauer (1991). This enables us to deduce that VRJP and
ERRW are positive recurrent in any dimension for large reinforcement, and that
VRJP is transient in dimension greater than or equal to 3 for small
reinforcement, using results of Disertori and Spencer (2010), Disertori,
Spencer and Zirnbauer (2010). Comment: 23 pages, 1 figure
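To make the reinforcement mechanism concrete, the following is a minimal sketch of a linearly edge-reinforced walk on a finite graph. It assumes that every edge starts with the same weight and that each crossing adds a fixed increment; both values, and the names `errw_path`, `neighbors`, `initial_weight` and `increment`, are illustrative assumptions rather than the paper's notation. It does not implement the VRJP representation or the sigma-model computation.

```python
import random


def errw_path(neighbors, start, n_steps,
              initial_weight=1.0, increment=1.0, seed=0):
    """Simulate an edge-reinforced random walk (illustrative sketch).

    neighbors maps each vertex to a list of adjacent vertices.
    The walk jumps along an edge with probability proportional to the
    edge's current weight, then adds `increment` to that edge's weight.
    """
    rng = random.Random(seed)
    weights = {}  # undirected edge (u, v) with u <= v -> current weight

    def edge(u, v):
        return (u, v) if u <= v else (v, u)

    path = [start]
    current = start
    for _ in range(n_steps):
        nbrs = neighbors[current]
        w = [weights.get(edge(current, v), initial_weight) for v in nbrs]
        nxt = rng.choices(nbrs, weights=w, k=1)[0]
        e = edge(current, nxt)
        weights[e] = weights.get(e, initial_weight) + increment
        path.append(nxt)
        current = nxt
    return path, weights


if __name__ == "__main__":
    # A 4-cycle, chosen only as a small example graph.
    cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
    path, weights = errw_path(cycle, start=0, n_steps=20)
    print(path)
    print(weights)
```

In this sketch a small `initial_weight` relative to `increment` corresponds to strong reinforcement, since the first few crossings then dominate the jump probabilities.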
Prospective individual patient data meta-analysis of two randomized trials on convalescent plasma for COVID-19 outpatients
Data on convalescent plasma (CP) treatment in COVID-19 outpatients are scarce. We aimed to assess whether CP administered during the first week of symptoms reduced the disease progression or risk of hospitalization of outpatients. Two multicenter, double-blind randomized trials (NCT04621123, NCT04589949) were merged with data pooling starting when [...] >= 50 years and symptomatic for <= 7 days were included. The intervention consisted of 200-300 mL of CP with a predefined minimum level of antibodies. Primary endpoints were a 5-point disease severity scale and a composite of hospitalization or death by 28 days. Amongst the 797 patients included, 390 received CP and 392 placebo; they had a median age of 58 years, 1 comorbidity, 5 days of symptoms, and 93% had a negative IgG antibody test. Seventy-four patients were hospitalized, 6 required mechanical ventilation and 3 died. The odds ratio (OR) of CP for improved disease severity scale was 0.936 (credible interval (CI) 0.667-1.311); OR for hospitalization or death was 0.919 (CI 0.592-1.416). CP effect on hospital admission or death was largest in patients with <= 5 days of symptoms (OR 0.658, 95% CI 0.394-1.085). CP did not decrease the time to full symptom resolution.
Traps of stochastic algorithms and vertex-reinforced random walks (Pièges des algorithmes stochastiques et marches aléatoires renforcées par sommets)
CACHAN-ENS (940162301) / Sudoc, France
When can the two-armed bandit algorithm be trusted?
SIGLE. Available from INIST (FR), Document Supply Service, under shelf-number: 22522, issue: a.2002 n.14 / INIST-CNRS - Institut de l'Information Scientifique et Technique, France